Noisy Text Analytics

Author

  • L. Venkata Subramaniam
Abstract

Text produced by processing signals intended for human use is often noisy for automated computer processing. Digital text produced in informal settings such as online chat, SMS, emails, tweets, message boards, newsgroups, blogs, wikis and web pages contains considerable noise. Processing techniques such as Automatic Speech Recognition, Optical Character Recognition and Machine Translation also introduce processing noise. People are adept at pattern recognition tasks involving typeset or handwritten documents or recorded speech; machines are less so.

Similar resources

How Much Noise in Text is too Much: A Study in Automatic Document Classification

Noise is a stark reality in real-life data. In the domain of text analytics especially, it has a significant impact, as data cleaning forms a very large part (up to 80% of the time) of the data processing cycle. Noisy unstructured text is common in informal settings such as online chat, SMS, email, newsgroups and blogs, automatically transcribed text from speech data, and automatically recognized text f...

Text Analytics of Customers on Twitter: Brand Sentiments in Customer Support

Brand community interactions and online customer support have become major platforms for strengthening brand sentiment and creating loyalty. Rapid brand responses to each customer request through inbound tweets on Twitter, and proper actions taken to cover the needs of customers, are the key elements of positive brand sentiment creation and product or service initiative management in the realm of ...

Company Mention Detection for Large Scale Text Mining

Text mining on a large scale that addresses actionable prediction needs to contend with noisy information in documents, and with interdependencies between the kinds of NLP techniques applied and the data representation of instances. This paper presents an initial investigation of the impact of improved company mention detection for financial analytics. Coverage of company mention detection impr...

Analytics for Noisy Unstructured Text Data II

The importance of text mining applications is growing in proportion to the exponential growth of electronic text. Along with the growth of the internet, many other sources of electronic text have become very popular. With the increasing penetration of the internet, many forms of communication and interaction such as email, chat, newsgroups, blogs, discussion groups, scraps etc. have become increasingly...

Publication date: 2010